Lexico: Extreme KV Cache Compression via Sparse Coding over Universal Dictionaries

Kim, Junhyuck, Park, Jongho, Cho, Jaewoong, Papailiopoulos, Dimitris

arXiv.org Artificial Intelligence

We introduce Lexico, a novel KV cache compression method that leverages sparse coding with a universal dictionary. Our key finding is that the key-value cache in modern LLMs can be accurately approximated using sparse linear combinations of atoms from a small, input-agnostic dictionary of 4k atoms, enabling efficient compression across different input prompts, tasks, and models. Using orthogonal matching pursuit for sparse approximation, Lexico achieves flexible compression ratios through direct sparsity control. Lexico maintains 90-95% of the original performance while using only 15-25% of the full KV-cache memory, outperforming both quantization and token eviction methods. Notably, Lexico remains effective in low-memory regimes where 2-bit quantization fails, achieving up to 1.7x better compression on LongBench and GSM8K while maintaining high accuracy.

Figure 1: Memory usage vs. performance of Lexico compared to other key-value (KV) cache compression methods on GSM8K. The figure illustrates the relationship between KV cache size and the performance of Lexico on Llama models in the GSM8K 5-shot evaluation. Lexico consistently outperforms both eviction-based methods (SnapKV, PyramidKV) and quantization-based methods (per-token quantization, KIVI, ZipCache).

Transformers (Vaswani et al., 2017) have become the backbone of frontier Large Language Models (LLMs), driving progress in domains beyond natural language processing. However, Transformers are typically limited by their significant memory requirements. This stems not only from the large number of model parameters, but also from having to maintain the KV cache, which grows in proportion to the model size (i.e., the number of layers, heads, and embedding dimension) and the token length of the input.
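The core mechanism the abstract describes, orthogonal matching pursuit (OMP) over a fixed dictionary, can be sketched in a few lines. This is a minimal illustration of the general technique, not the paper's implementation: the dictionary below is random rather than a learned universal dictionary, the dimensions are toy-sized, and the vector stands in for a single cached key or value.

```python
import numpy as np

def omp(D, x, s):
    """Orthogonal matching pursuit: approximate x with at most s
    atoms (columns) of the dictionary D."""
    residual = x.copy()
    support = []
    coeffs = np.zeros(D.shape[1])
    sol = np.zeros(0)
    for _ in range(s):
        # greedily pick the atom most correlated with the current residual
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx not in support:
            support.append(idx)
        # re-fit coefficients on the chosen support by least squares
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coeffs[support] = sol
    return coeffs

# toy stand-in for one cached KV vector: 32 dims, 256 unit-norm atoms
rng = np.random.default_rng(0)
D = rng.standard_normal((32, 256))
D /= np.linalg.norm(D, axis=0)   # normalize atoms to unit norm
x = rng.standard_normal(32)

# a sparser code costs less memory but reconstructs less accurately;
# the sparsity budget s is the direct compression knob
for s in (4, 8, 16):
    err = np.linalg.norm(x - D @ omp(D, x, s)) / np.linalg.norm(x)
    print(f"s={s:2d}  relative error={err:.3f}")
```

Storing only the `s` nonzero coefficients and their atom indices, instead of the dense vector, is what yields the compression ratio, and the error is monotonically non-increasing as `s` grows.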


Hear a good Sunday sermon? AI ready to make preacher's words count all week long

FOX News

Church leaders and volunteers will soon have access to an artificial intelligence platform that aims to shave hours off their day-to-day tasks by generating content from sermons to engage fellow Christians when they are not in the pews. The upcoming platform Pulpit AI, founded by Michael Whittle, is expected to launch later this summer and will serve as a tool for Christian leaders looking to take the tedious work out of crafting religious blog posts, devotionals, prayer guides and social media posts. "We want to help pastors of small to medium-sized churches be able to make content for their congregations to interact with throughout the week and on social media," Whittle told Fox News Digital. "We think every pastor should, if they want, have a digital signal to their congregations beyond the sermon. "Most small to medium-sized churches have small or completely volunteer staff, so they have zero operational leverage when it comes to media and resources for their church," he added. "If we can help a church media team get past the blank page, we can not only save them crazy amounts of time, we can help every church become a resourcing church for their people." Pulpit AI "doesn't and never will" generate sermons; instead, it serves as a tool where the user uploads a sermon or religious podcast in order to repurpose it into "social media highlights, blog posts, discussion questions, and the other content churches use to reach their congregations and communities day in and day out," Whittle said. "Pulpit AI analyzes long form audio and video, then repurposes that into various forms of content," Whittle said.
"Pulpit AI's output is taken directly from the source material.


Vu Digital enhances Vu for Law Enforcement by adding A.I. layer to identify and predict relevant events from digital evidence

#artificialintelligence

Police and prosecutors face the daunting task of identifying, managing and reviewing a veritable tsunami of digital evidence generated daily by officer bodycams, jailhouse calls, interrogation videos, dispatch calls and closed-circuit television, which stretches staffing capabilities and resources. Today, Vū Digital for Law Enforcement's A.I. capabilities deliver a robust and dynamic evidence management solution for law enforcement agencies worldwide. The layer recognizes and identifies events, keywords and sequences of words generated by Vū's automated tagging engine to tag and identify instances of significance, including confessions, identification of personal information and references to certain "trigger" words. "Data can exist without A.I., but not vice versa," said Wade Smith, vice president of operations for Vū Digital. "First, we create data where it otherwise didn't exist, then we apply an artificial intelligence layer to identify relevant events from the digital evidence. In the end, artificial intelligence is only as powerful as the data it considers."


Multi-Instance Multi-Label Learning

Zhou, Zhi-Hua, Zhang, Min-Ling, Huang, Sheng-Jun, Li, Yu-Feng

arXiv.org Artificial Intelligence

Nanjing University, Nanjing 210046, China

Abstract: In this paper, we propose the MIML (Multi-Instance Multi-Label learning) framework where an example is described by multiple instances and associated with multiple class labels. Compared to traditional learning frameworks, the MIML framework is more convenient and natural for representing complicated objects which have multiple semantic meanings. To learn from MIML examples, we propose the MimlBoost and MimlSvm algorithms based on a simple degeneration strategy, and experiments show that solving problems involving complicated objects with multiple semantic meanings in the MIML framework can lead to good performance. Considering that the degeneration process may lose information, we propose the D-MimlSvm algorithm which tackles MIML problems directly in a regularization framework. Moreover, we show that even when we do not have access to the real objects and thus cannot capture more information from real objects by using the MIML representation, MIML is still useful. We propose the InsDif and SubCod algorithms. InsDif works by transforming single instances into the MIML representation for learning, while SubCod works by transforming single-label examples into the MIML representation for learning. Experiments show that in some tasks they are able to achieve better performance than learning from the single instances or single-label examples directly. Email: zhouzh@lamda.nju.edu.cn

1 Introduction

In traditional supervised learning, an object is represented by an instance, i.e., a feature vector, and associated with a class label. Formally, let X denote the instance space (or feature space) and Y the set of class labels. In particular, each object in this framework belongs to only one concept and therefore the corresponding instance is associated with a single class label. However, many real-world objects are complicated and may belong to multiple concepts simultaneously. For example, an image can belong to several classes simultaneously, e.g., grasslands, lions, Africa, etc.; a text document can be classified into several categories if it is viewed from different aspects, e.g., scientific novel, Jules Verne's writing or even books on traveling; a web page can be recognized as a news page, sports page, soccer page, etc. In a specific real task, maybe only one of the multiple concepts is the right semantic meaning. For example, in image retrieval, when a user is interested in an image with lions, s/he may be interested only in the concept lions instead of the other concepts grasslands and Africa associated with that image. The difficulty here is caused by those objects that involve multiple concepts. Choosing the right semantic meaning for such objects in a specific scenario is the fundamental difficulty of many tasks.
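The MIML representation and the degeneration strategy mentioned in the abstract can be sketched as follows. This is a hypothetical illustration, not the authors' MimlBoost or MimlSvm code: each example is a bag of instance vectors paired with a set of labels, and the degeneration step splits one MIML problem into one binary multi-instance task per class label.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class MIMLExample:
    # one object = a bag of instances plus a set of class labels
    instances: np.ndarray   # shape (n_instances, n_features)
    labels: set             # e.g. {"lions", "africa"}

def degenerate_to_mil(examples, label_space):
    """Degeneration step (in the spirit of the paper's strategy):
    split a MIML problem into one binary multi-instance learning
    problem per label, losing the correlations between labels."""
    tasks = {}
    for label in label_space:
        # each bag becomes (instances, is-this-label-present)
        tasks[label] = [(ex.instances, label in ex.labels) for ex in examples]
    return tasks

# toy image: 3 patches, each a 4-dim feature vector, two labels
ex = MIMLExample(np.zeros((3, 4)), {"lions", "grasslands"})
tasks = degenerate_to_mil([ex], ["lions", "grasslands", "africa"])
print(tasks["lions"][0][1])   # True: the bag is positive for "lions"
print(tasks["africa"][0][1])  # False: "africa" is absent from its label set
```

Any off-the-shelf multi-instance learner can then be trained on each binary task; the information loss the abstract mentions is visible here, since the per-label tasks no longer share the label co-occurrence structure.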


Multi-Instance Multi-Label Learning with Application to Scene Classification

Zhang, Zhi-Li, Zhang, Min-ling

Neural Information Processing Systems

In this paper, we formalize multi-instance multi-label learning, where each training example is associated with not only multiple instances but also multiple class labels. Such a problem can occur in many real-world tasks, e.g. an image usually contains multiple patches each of which can be described by a feature vector, and the image can belong to multiple categories since its semantics can be recognized in different ways. We analyze the relationship between multi-instance multi-label learning and the learning frameworks of traditional supervised learning, multi-instance learning and multi-label learning.

